Abstract

Achieving facile specific recognition is essential for intrinsically disordered proteins (IDPs) that are involved in cellular 
signaling and regulation. Consideration of the physical time scales of protein folding and diffusion-limited protein-protein 
encounter has suggested that the frequent requirement of protein folding for specific IDP recognition could lead to kinetic 
bottlenecks. How IDPs overcome such potential kinetic bottlenecks to viably function in signaling and regulation in general 
is poorly understood. Our recent computational and experimental study of the cell-cycle regulator p27 (Ganguly et al., J. Mol. 
Biol. (2012)) demonstrated that long-range electrostatic forces exerted on enriched charges of IDPs could accelerate 
protein-protein encounter via "electrostatic steering" and at the same time promote "folding-competent" encounter 
topologies to enhance the efficiency of IDP folding upon encounter. Here, we further investigated the coupled binding and 
folding mechanisms and the roles of electrostatic forces in the formation of three IDP complexes with more complex folded 
topologies. The surface electrostatic potentials of these complexes lack prominent features like those observed for the p27/ 
Cdk2/cyclin A complex to directly suggest the ability of electrostatic forces to facilitate folding upon encounter. 
Nonetheless, similar electrostatically accelerated encounter and folding mechanisms were consistently predicted for all 
three complexes using topology-based coarse-grained simulations. Together with our previous analysis of charge 
distributions in known IDP complexes, our results support a prevalent role of electrostatic interactions in promoting efficient 
coupled binding and folding for facile specific recognition. These results also suggest that there is likely a co-evolution of 
IDP folded topology, charge characteristics, and coupled binding and folding mechanisms, driven at least partially by the 
need to achieve fast association kinetics for cellular signaling and regulation. 
Introduction

Cellular signaling and regulation are frequently mediated by 
proteins that, in part or as a whole, lack stable structures under 
physiological conditions [1-3]. Such intrinsically disordered 
proteins (IDPs) are highly prevalent in proteomes [4] and
over-represented in disease pathways [5,6]. For example, nearly
one-third of eukaryotic proteins have been predicted to contain
extended disordered regions [7], and about 25% of disease- 
associated missense mutations can be mapped into predicted 
disordered regions [8] (although cancer mutations appear to prefer 
ordered regions [9]). The prevalence of intrinsic disorder suggests 
that protein conformational heterogeneity could provide crucial 
functional advantages, for which many concepts have been 
proposed [10-14]. Understanding the physical basis of how 
intrinsic disorder mediates protein function (and how such
functional mechanisms may fail in human diseases [15]) is of
fundamental significance and has attracted intense interest in
recent years [16]. Important progress has been made in
characterizing the conformational properties of unbound IDPs
and determining how these conformational properties contribute 
to efficient and reliable interactions [16-22]. 



A key recent recognition is that the frequent requirement of protein
folding for specific recognition of IDPs could lead to kinetic 
bottlenecks [23-25]. As predicted by the dual-transition-state 
theory [23], the diffusion-limited encounter rate constant repre- 
sents the upper bound for that of a coupled binding and folding 
interaction. Importantly, the upper bound can be achieved only if 
the IDP readily folds upon encounter, which requires folding rates 
on the order of 10 μs⁻¹ or greater [23]. That is, IDPs need to
achieve folding rates beyond the typical μs⁻¹ "speed limit"
estimated for folding of isolated proteins [26] to maximize 
association kinetics. Therefore, the putative functional advantages 
of intrinsic disorder, especially structural plasticity for specific 
interactions with numerous partners [27], come with a potential 
cost of slow binding kinetics. Such a kinetic bottleneck must be
resolved for IDPs to be viable in cellular signaling and regulation. 
Interestingly, a recent survey of binding kinetic data revealed that 
IDP binding was not systematically slower than that of globular 
proteins [28]. The implication is that most IDPs do manage to fold 
rapidly upon nonspecific binding, and this is apparently consistent 
with the accumulating observations that IDP coupled binding and 
folding tends to follow induced folding-like baseline mechanisms 
(i.e., bind then fold) [16,19]. Several factors could contribute to
efficient folding of IDPs upon binding, in particular small
interacting (and folding) domains and simple folded topologies
with low contact orders. There also appears to be a delicate
balance between pre-folding and conformational flexibility that
allows an IDP to quickly fluctuate among accessible conformational
states, especially upon encounter [16,29,30]. Nonetheless, it
is not yet clear how in general IDPs may achieve fast folding at
rates beyond the traditional μs⁻¹ folding "speed limit" upon
encountering their specific targets.

Author Summary

Intrinsically disordered proteins (IDPs) are key components
of regulatory networks that dictate various aspects of
cellular decision-making. They are over-represented in
major disease pathways, and are considered novel albeit
currently difficult drug targets. Recognition of IDPs has
extended the traditional protein structure-function paradigm,
and various concepts have been proposed on how
intrinsic disorder may confer crucial functional advantages.
However, the physical basis of these concepts remains
poorly established. In particular, while IDPs alone exist as
ensembles of fluctuating structures, they frequently fold
upon specific binding. Analysis of the physical timescales
of protein folding and protein-protein encounter predicts
that the requirement of peptide folding for specific
binding could lead to a major kinetic bottleneck. In this
work, carefully calibrated topology-based coarse-grained
models were applied to directly simulate reversible folding
and binding and investigate the recognition mechanisms
of three IDP complexes. The results strongly support an
electrostatically accelerated encounter and folding mechanism,
where long-range electrostatic forces not only
accelerate protein-protein encounter via "electrostatic
steering" but also promote "folding-competent" encounter
topologies to enhance the efficiency of IDP folding
upon encounter.

An important characteristic of IDPs is that they are enriched
with charged and polar residues [31]. Electrostatics can thus be
expected to play key roles in IDP structure and function. For
example, the charge content can modulate compaction and other
conformational properties of free IDPs [32,33], and DNA search
efficiency is controlled by charge composition and distribution in
disordered tails of DNA-binding proteins [34,35]. It has also
been observed or speculated in a few cases that electrostatics
might be important for fast IDP recognition [36-39]. However,
these discussions have often been based on the classic
electrostatic steering effects [40], and the actual underlying
mechanisms of putative electrostatic acceleration were not
known. Our recent computational and experimental study of
the p27-Cdk2/cyclin A interaction revealed that long-range
electrostatic forces could promote facile IDP recognition via an
"electrostatically accelerated encounter and folding mechanism"
[24]. Specifically, the measured p27/Cdk2/cyclin A
association rate constants showed a strong salt dependence,
increasing ~12-fold when the ionic strength was reduced from
0.6 to 0.075 M. However, the salt dependence is poorly
described by an approximate Debye-Hückel relation [41] that
mainly captures the electrostatic steering effects. Instead,
simulations using a series of topology-based coarse-grained
models suggested that long-range electrostatic forces exerted on
a large number of charges on p27 not only accelerated the
encounter rate (via the classical electrostatic steering effect [40])
but also enhanced the efficiency of p27 folding upon encounter by
promoting native-like encounter topologies.
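
To give a feel for how much salt dependence pure electrostatic steering can produce, the following minimal Python sketch evaluates one commonly used approximate Debye-Hückel form, ln(k_a/k_a0) = -(U/kT)/(1 + κa). It assumes water near 25 °C with a 1:1 electrolyte, for which the Debye length is roughly 3.04/√I Å (I in M). The interaction energy U = -5 kT and the contact distance a = 6 Å are purely illustrative placeholders, not values taken from this study or from reference [41], and the function names are hypothetical.

```python
import math

def debye_length_angstrom(ionic_strength_molar):
    """Approximate Debye length (angstrom) for a 1:1 salt in water near 25 C."""
    return 3.04 / math.sqrt(ionic_strength_molar)

def rate_enhancement(ionic_strength_molar, u_over_kt=-5.0, contact_distance=6.0):
    """Factor k_a / k_a0 from ln(k_a/k_a0) = -(U/kT) / (1 + kappa * a).

    u_over_kt and contact_distance are illustrative assumptions only.
    """
    kappa = 1.0 / debye_length_angstrom(ionic_strength_molar)
    return math.exp(-u_over_kt / (1.0 + kappa * contact_distance))

low_salt, high_salt = 0.075, 0.6  # ionic strengths (M) quoted in the text
fold_change = rate_enhancement(low_salt) / rate_enhancement(high_salt)

print(f"Debye length: {debye_length_angstrom(low_salt):.1f} A at {low_salt} M, "
      f"{debye_length_angstrom(high_salt):.1f} A at {high_salt} M")
print(f"Steering-only fold change between the two salt conditions: {fold_change:.1f}")
```

With other choices of U and a the predicted fold change shifts accordingly, so the sketch is useful only for exploring the functional form, not for reproducing the measured p27 data.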



Analysis of surface charges in a set of existing IDP complexes 
further revealed that the vicinity of IDP binding sites tended to be 
enriched with charges to complement those on IDPs [24] (even 
though the IDP binding interface itself is more hydrophobic than 
the rest of the protein surface as previously observed [42]). 
Electrostatic forces are known to be a dominant long-range force 
that can guide protein orientation in protein-DNA interactions 
[43,44] and/or modulate early stages of protein folding [45-47]. 
One implication of enriched charges near IDP binding sites is thus 
that the electrostatically accelerated encounter and folding 
mechanism observed for p27 may be prevalent in signaling and 
regulatory IDPs. Nonetheless, the ability of long-range electrostatic
forces to enhance folding upon binding may seem surprising, as
nonspecific interactions (electrostatic or van der Waals) have been 
generally expected to accelerate binding but slow down folding 
[48,49]. It has also been predicted that, while inter-chain 
electrostatic interactions facilitate binding of disordered chaperone 
Chz1 to histone variant H2A.Z-H2B, intra-chain electrostatic
interactions could lead to premature collapse of Chz1 under low
salt conditions and hinder the overall rate of forming the specific 
complex [50]. 

In the present work, we investigated the recognition mechanisms
and the roles of long-range electrostatic interactions in
the formation of three IDP complexes, namely p53-TAD1/TAZ2,
HIF-1α/TAZ1, and NCBD/ACTR (Table 1). All these complexes
have important biological functions. For example, tumor suppressor
p53 is considered one of the most important proteins in cancer
[51]; NCBD and TAZ1/2 are key regulatory domains of CBP, a
key component of the general transcriptional machinery that plays
critical roles in cell fate regulation [52]. For understanding IDP
recognition, these systems involve more complex folded topologies
than that of p27 in the p27/Cdk2/cyclin A complex. As shown in
Fig. 1, both HIF-1α/TAZ1 and NCBD/ACTR possess extensive
binding interfaces, whereas the binding interface in p53-TAD1/TAZ2
is more localized. Importantly, while strong charge
complementarity exists near the binding interface (as expected),
the surface electrostatic potentials of the folded substrates do not
show prominent features like those observed on Cdk2/cyclin A
(e.g., see Fig. 1 of reference [24]) to directly suggest that long-range
electrostatic forces could promote native-like (and thus more
folding-competent) encounter complexes. The NCBD/ACTR
complex involves synergistic folding of two IDPs and thus offers
a particularly interesting opportunity to understand whether and
how electrostatic interactions may modulate the formation of
nontrivial folded topologies. Remarkably, all three complexes
associate with on-rates in excess of 10⁷ M⁻¹s⁻¹ (see Table 1), a
regime that is typically considered "diffusion-limited" and can
only be accessed in the limit of ultrafast conformational transitions
[40].

Results 

Topology-based modeling of IDP coupled binding and 
folding 

A series of topology-based coarse-grained models was first
derived from the complex structures to allow direct simulation
of reversible binding and folding with tractable computational 
cost. Topology-based modeling is based on the theoretical 
framework of minimally frustrated energy landscapes for natural 
proteins [53], and has been highly successful in predicting essential 
features of protein folding mechanisms [53-55]. Formation of 
stable IDP complexes such as those studied in this work should also 
satisfy minimal frustration, and thus topology-based modeling is 
applicable. Indeed, it has been successfully applied to several IDP 
Table 1. Key properties of three IDP complexes.

Name (a)         Length   K_D (b)        k_on (M⁻¹s⁻¹)   PDB    IDP fold        Charges (c)
---------------------------------------------------------------------------------------------------
p53-TAD1/TAZ2    39/90    2.7 μM [77]    ~10⁸ (d)         2k8f   helix/loops     8(-6), 9(+9), 4(+2)
HIF-1α/TAZ1      51/99    7 nM [70]      1.3×10⁹ [78]     1l8c   helices/loops   11(-5), 11(+7), 10(+5)
NCBD/ACTR        59/47    34 nM [79]     3×10⁷ [65]       1kbh   helices         - (both IDPs)

(a) Abbreviations: ACTR: the activation domain of p160 steroid receptor co-activator; HIF-1α: hypoxia-inducible factor 1 α subunit; NCBD: the nuclear-receptor co-activator
binding domain of CREB binding protein (CBP); p53-TAD1: the transactivation domain 1 of tumor suppressor p53; TAZ1/2: the TAZ domains of CBP. The sequences of all
IDPs involved (highlighted in bold font) are provided in the Supporting Information, Text S1.

(b) The experimental K_D values were measured at 308 K for p53-TAD1/TAZ2, 298 K for HIF-1α/TAZ1, and 304 K for NCBD/ACTR. Note that K_D only weakly depends on
temperature for p53-TAD1/TAZ2 (doubled when the temperature is increased from 288 K to 308 K [77]).

(c) Numbers of charged residues and the net charges (in parentheses) of the IDP, its binding site, and the vicinity of the binding site. Residues at the IDP binding interface
are identified as those with greater than 1.0 Å² solvent accessible surface area changes upon complex formation. Surface residues are identified as those with >5%
solvent accessibility. All surface residues within 15 Å Cα-Cα distance from the bound IDP but not directly involved in intermolecular contacts are considered to be within
the vicinity of the IDP binding site.

(d) Estimated based on the association rate constant of p53-TAD2/TAZ2 (~10¹⁰ M⁻¹s⁻¹ [38]), assuming that TAD1 and TAD2 have similar off rates. TAD2 binds to the TAZ2
primary site with K_D ~32 nM [38], about two orders of magnitude stronger than TAD1.

doi:10.1371/journal.pcbi.1003363.t001




Figure 1. Structures and surface electrostatic potentials of three complexes. A) p53-TAD1/TAZ2, B) NCBD/ACTR, and C) HIF-1α/TAZ1. TAZ2,
NCBD and TAZ1 are shown as molecular surfaces and colored based on the surface electrostatic potential calculated using the PBEQ module of CHARMM
[80,81]. Red indicates negative and blue indicates positive charge. p53-TAD1, ACTR and HIF-1α are shown in cartoon representation, with charged side chains shown
as sticks.

doi:10.1371/journal.pcbi.1003363.g001 
Table 2. Dissociation constants, melting temperatures, and average reversible coupled binding and folding transition rates calculated
using various coarse-grained models with and without explicit charges and/or 0.05 M salt.

Models                  Calc. K_D     T_m (K)   k_TS (μs⁻¹)   k_cap (ns⁻¹)   k_esc (ns⁻¹)   k_evo (ns⁻¹)
-----------------------------------------------------------------------------------------------------------
TAD1/TAZ2
  No charge             1.4±2.0 μM    327       4.3±1.5       1.4            8.0            0.049
  Charged, 0.05 M salt  1.6±1.6 μM    340       14.5±1.1      3.2            4.1            0.08
  Explicit charges      4.9±3.2 μM    335       27.0±0.2      32.1           0.10           0.16
HIF-1α/TAZ1
  No charge             64±64 nM      327       6.1±0.5       2.8            6.3            0.022
  Charged, 0.05 M salt  9.4±9.6 nM    340       10.2±1.8      3.4            5.6            0.039
  Explicit charges      1.3±1.6 nM    345       29.4±3.7      5.0            0.69           0.048
NCBD/ACTR
  No charge             67±99 nM      318       0.53±0.2      0.13           0.61           0.0043
  Charged, 0.05 M salt  96±92 nM      315       1.7±0.1       0.29           0.31           0.0074
  Explicit charges      39±14 nM      322       5.2±0.7       0.79           0.020          0.012

K_D was calculated from REX simulations at 300 K (see Table 1 for the experimental values); k_TS was calculated from the production Langevin simulations at the
corresponding T_m, as k_TS = N_TS/t_tot, where N_TS is the number of reversible binding and folding transitions observed during the total simulation time span t_tot. As all
simulations were performed at T_m, k_TS as defined is half of the binding and unbinding rates. k_cap, k_esc and k_evo are defined in Eqns. 1-4. The effective concentrations of
these simulations are 1.66 mM, 1.66 mM and 1.43 mM for p53-TAD1/TAZ2, HIF-1α/TAZ1 and NCBD/ACTR, respectively. All uncertainties were estimated as the
differences between results calculated from the first and second halves of the data.
doi:10.1371/journal.pcbi.1003363.t002



complexes [56-60], with many key predictions substantiated by 
independent experimental studies. Nonetheless, important differ- 
ences do exist between IDPs and structured proteins in sequence 
compositions and binding interface characteristics [42]. We have 
previously demonstrated that traditional topology-based models 
need to be carefully calibrated to ensure proper balance among 
competing intramolecular and intermolecular interactions (see
Methods for details of the calibration protocol) [61]. We note that
the importance of model calibration was also illustrated in a recent
study of the HIF-1α/TAZ1 complex [59].

Table 2 summarizes the final calibrated models for all three
complexes. The calculated residual helicity distributions of the
unbound states are shown in Fig. S1. Three independent models
were constructed for each complex: one without explicit charges
(mimicking high salt concentration with fully screened long-range
electrostatic interactions), one with explicit charges (mimicking low
salt concentration with unscreened long-range electrostatic
interactions), and a third one with explicit charges and 0.05 M salt
(mimicking physiological conditions). All models reproduce the
experimental K_D to the same order of magnitude, except that the
no-charge model for HIF-1α/TAZ1 yields a K_D value about one
order of magnitude too large. We note that calculated K_D values
can be very sensitive to small changes in the scaling of
intermolecular interactions during model calibration (see Methods).
It is computationally expensive to use REX simulations to
systematically search the parameter space, especially for
models without explicit charges due to slower transitions.
Nonetheless, by performing production simulations at the corre- 
sponding melting temperatures, remaining imperfections in the 
balance of various interactions should be further suppressed, 
allowing reliable comparative studies of the mechanistic roles of 
electrostatic interactions in coupled binding and folding. 
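
The effective concentrations quoted with Table 2 (1.66 mM and 1.43 mM) can be related to a confinement volume through simple unit conversion. The short sketch below shows that arithmetic: one molecule in about 1000 nm³ corresponds to roughly 1.66 mM. The function names are hypothetical, and the actual simulation geometry is described in Methods rather than here, so this is only a consistency check on units.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def effective_concentration_mM(volume_nm3):
    """Concentration (mM) of a single molecule confined to volume_nm3."""
    volume_liters = volume_nm3 * 1e-24  # 1 nm^3 = 1e-27 m^3 = 1e-24 L
    return 1e3 / (AVOGADRO * volume_liters)

def volume_nm3_for(concentration_mM):
    """Inverse relation: confinement volume (nm^3) for a given concentration (mM)."""
    return 1e3 / (AVOGADRO * concentration_mM * 1e-24)

for conc in (1.66, 1.43):  # effective concentrations listed with Table 2
    print(f"{conc} mM corresponds to one molecule per {volume_nm3_for(conc):.0f} nm^3")
```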

Baseline mechanisms of coupled binding and folding: 
Effects of electrostatic forces 

Free energy surfaces were constructed using various combinations
of folding and binding order parameters to understand the
baseline mechanisms of coupled binding and folding and to dissect
the effects of long-range electrostatic forces. In particular, the
fractions of native contacts formed have been shown to provide
natural reaction coordinates for such mechanistic analysis [62].
Fig. 2 compares the free energy surfaces as a function of intra- and
intermolecular native contact fractions for all three complexes,
calculated using calibrated Go-like models with and without
explicit charges and/or salt (see Table 2). Both p53-TAD1 and
HIF-1α recognitions follow induced folding-like mechanisms,
where the peptides only gain structure after forming significant
numbers of native intermolecular contacts. For example, Fig. 2A
shows that p53-TAD1 does not start to fold until Q_inter reaches
~0.5. Free NCBD is a molten globule with folded-like secondary
structures [63], and its synergistic folding with ACTR has been
previously shown to involve multiple stages of selection and
induced folding [25,60], reminiscent of the "extended conformational
selection" mechanism [30]. Nonetheless, neither protein
gains significant secondary (for ACTR) or tertiary (for NCBD)
structures until over 20% of native intermolecular contacts are
formed (Fig. 2G and 2J).
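
As an illustration of the order parameters used above, the following minimal Python sketch computes the fraction of native contacts formed in a single conformation of a Cα-only model. It assumes that a contact counts as formed when the bead separation is within 1.2 times its native distance; this cutoff, the toy coordinates and the function name are illustrative assumptions, and the criterion actually used in this work is the one given in Methods.

```python
import math

def fraction_native_contacts(coords, native_contacts, tolerance=1.2):
    """Return the fraction of native contacts formed in one conformation.

    coords: list of (x, y, z) tuples, one per C-alpha bead.
    native_contacts: list of (i, j, r_native) with bead indices and native distance.
    tolerance: a contact is formed if r_ij <= tolerance * r_native (assumed value).
    """
    if not native_contacts:
        raise ValueError("native contact list is empty")
    formed = 0
    for i, j, r_native in native_contacts:
        if math.dist(coords[i], coords[j]) <= tolerance * r_native:
            formed += 1
    return formed / len(native_contacts)

# Toy example: beads 0-1 play the role of the IDP, beads 2-3 the folded target.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (0.0, 6.0, 0.0), (3.8, 20.0, 0.0)]
intra_contacts = [(0, 1, 3.8)]               # within the IDP
inter_contacts = [(0, 2, 6.0), (1, 3, 6.0)]  # between IDP and target

q_intra = fraction_native_contacts(coords, intra_contacts)
q_inter = fraction_native_contacts(coords, inter_contacts)
print(q_intra, q_inter)
```

Histogramming such (Q_intra, Q_inter) pairs over a trajectory and taking -kT ln of the resulting probabilities is one standard route to two-dimensional free energy surfaces of the kind discussed here.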

Interestingly, formation of all three complexes involves
intermediates, even though the intermediate in the p53-TAD/TAZ2
interaction only becomes pronounced in the presence of nonspecific
electrostatic forces (see Fig. 2A vs 2C). Detailed examination
of the simulation trajectories and various free energy surfaces using
fractions of native contacts formed by different IDP segments (e.g.,
see Figs. S2, S3, S4) revealed the existence of multiple parallel
pathways for forming HIF-1α/TAZ1 and NCBD/ACTR. While
these mechanistic details are not the focus of the current work,
they appear to be highly consistent with previous experimental
and computational studies. For example, as shown in Fig. S2, both
the first and third helices of HIF-1α could initiate recognition, with
the pathway initiated by the third helix binding being much more
prevalent. Similar observations were also made in a separate
computational study [59]. Specific recognition of NCBD/ACTR
appears to be primarily initiated by the C-terminal segments of
these two peptides (Figs. S3, S4), which forms a key intermediate
that was also suggested by an H/D exchange mass spectrometry
study [64]. Kinetic data from a recent stopped-flow study of the
NCBD/ACTR interaction [65] are consistent with the prediction
of induced folding as a baseline mechanism and have further
confirmed the existence of parallel pathways and multiple folding
intermediates. Representative snapshots along the dominant
binding and folding pathways of p53-TAD1/TAZ2 and
HIF-1α/TAZ1 are shown in Figs. S5, S6.

Figure 2. Free-energy surfaces at T_m as a function of the fractions of intra- and intermolecular contacts formed, computed using
various Go-like models with and without explicit charges and/or 50 mM salt (see Table 2). Rows A-C, D-F and G-L are for the p53-TAD1/TAZ2,
HIF-1α/TAZ1 and NCBD/ACTR complexes, respectively. Q_inter is the fraction of intermolecular contacts formed; Q_p53, Q_HIF-1α and Q_ACTR are the
fractions of intramolecular contacts formed by p53-TAD1, HIF-1α and ACTR, respectively; Q_NCBD-tert is the fraction of tertiary intramolecular contacts
formed by NCBD (the helical content of NCBD remains similar during coupled binding and folding). Contours are drawn every kT, where k is
the Boltzmann constant and T is the absolute temperature.
doi:10.1371/journal.pcbi.1003363.g002

Explicit inclusion of charges does not significantly perturb the
baseline mechanisms of coupled binding and folding. As shown in
Fig. 2 and Figs. S2, S3, S4, long-range electrostatic forces do not
lead to fundamental changes in any of the free energy surfaces
examined. The baseline mechanisms for the formation of all three
complexes remain induced folding-like. Furthermore, nonspecific
electrostatic interactions do not change the relative prevalence of
the parallel pathways that exist. For example, HIF-1α still initiates
binding mainly through the third helix (Fig. S2); synergistic folding
of NCBD and ACTR is still mainly initiated through their
C-terminal segments (Figs. S3, S4). The key effect of electrostatic
forces appears to be substantial reductions in the free energy
barriers that separate various basins. That is, even under the
no-salt condition, strong nonspecific electrostatic interactions do not
appear to add to the ruggedness of coupled binding and folding
free energy surfaces. An implication is that there exists a level of
self-consistency between the charge distribution and folded
topology in the bound states, despite a lack of apparent
complementarity between folding topologies and surface
electrostatic potentials for these IDP complexes (see Fig. 1).

Electrostatically Accelerated Recognition of IDPs. PLOS Computational Biology | www.ploscompbiol.org | November 2013 | Volume 9 | Issue 11 | e1003363

Kinetic effects of long-range and nonspecific electrostatic 
forces 

Kinetics of coupled binding and folding were derived directly
from production Langevin dynamics simulations performed using
the calibrated Go-like models at their corresponding T_m. The
results, summarized in Table 2, show that long-range electrostatic
forces accelerate the reversible binding/unbinding transition rates
for all three complexes. The overall electrostatic acceleration,
estimated by comparing the average transition rates (k_TS)
calculated using models with and without explicit charges, ranges
from ~5-fold for HIF-1α to 10-fold for NCBD/ACTR. The
magnitude of acceleration is similar to what was previously
measured for other IDPs including p27 [24] and PUMA [39] (both
~10-fold). The presence of 0.05 M salt significantly attenuates the
predicted electrostatic acceleration, to only about two-fold.
However, the effect of salt screening on electrostatic acceleration
is likely over-predicted [24], which is due to the Cα-only model
used in this work and may be corrected with more detailed protein
models [45]. Consistent with the kinetic analysis, there are
significant reductions in the free energy barriers along Q_inter (see
Fig. 3), which has been shown to be a good binding reaction
coordinate [61]. In addition, the magnitude of barrier reduction
correlates well with the degree of rate acceleration calculated
directly from Langevin dynamics simulations, with the largest
barrier reduction observed for NCBD/ACTR and the smallest
reduction observed for HIF-1α/TAZ1.
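
The link between rate acceleration and barrier reduction can be made semi-quantitative with a simple Arrhenius-like estimate. The sketch below takes the k_TS values listed in Table 2 for the models without charges and with explicit charges and converts each ratio into an apparent barrier change in units of kT. It assumes that the rate scales as exp(-ΔF/kT) with an unchanged prefactor, and it ignores the fact that the two models were simulated at slightly different T_m, so the numbers are rough guides rather than a substitute for the free energy profiles in Fig. 3.

```python
import math

# Average transition rates k_TS (per microsecond) from Table 2:
# (model without charges, model with explicit charges)
k_ts = {
    "p53-TAD1/TAZ2": (4.3, 27.0),
    "HIF-1alpha/TAZ1": (6.1, 29.4),
    "NCBD/ACTR": (0.53, 5.2),
}

def apparent_barrier_reduction(k_without, k_with):
    """Barrier change in kT, assuming k ~ exp(-dF/kT) with a fixed prefactor."""
    return math.log(k_with / k_without)

for name, (k0, k1) in k_ts.items():
    print(f"{name}: {k1 / k0:.1f}-fold faster, "
          f"apparent barrier reduction ~{apparent_barrier_reduction(k0, k1):.1f} kT")
```

Under these assumptions the ordering of the three complexes follows the trend described in the text, with NCBD/ACTR showing the largest apparent reduction and HIF-1α/TAZ1 the smallest.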

To further analyze the effects of electrostatic interactions on
different stages of coupled binding and folding, the recognition
process was divided into two generic steps, including an encounter
step followed by an evolving (folding) step to the final bound and
folded state (Eq. 1 in Methods). Such generic decomposition
ignores the details of IDP-specific folding pathways, to allow one to
focus on the net effects of electrostatic forces on the overall
efficiency of IDP folding upon encounter. For this, three general
states were identified during production simulations, including the
unbound (U), collision complex (CC), and bound (B) states (see
Methods for specific criteria for state assignment). The mean first
passage times (MFPT) and numbers of transitions (N_trans) among
these states were then calculated. The results, summarized in
Tables S1, S2, S3, show that long-range electrostatic forces greatly
reduce the average encounter time, from 0.72 to 0.03 ns for
p53-TAD, from 0.37 to 0.20 ns for HIF-1α, and from 7.71 to 1.26 ns
for NCBD. At the same time, long-range electrostatic forces also
significantly enhance the efficiency of IDP folding upon encounter,
allowing much larger fractions of the encounter complexes to
eventually evolve to the bound states. For example, for
NCBD/ACTR, only 16 out of ~2300 encounter events evolved to the bound
state in the absence of long-range electrostatic forces (0.7%), whereas
with explicit charges, there was a ~37% probability (108 out of 288)
of forming the specific complex once the proteins were captured
into the collision complex state (Table S3). For the HIF-1α/TAZ1
complex, the percentages of collision-to-specific-complex transitions
are 0.4% without and 7% with explicit charges (Table S2); for
p53-TAD1/TAZ2, the corresponding percentages are 0.6% without
and 60% with explicit charges (Table S1). It should be emphasized
that nonspecific electrostatic interactions significantly stabilize the
collision complexes, due to large and complementary net charges
of the interacting proteins (see Table 1). As such, far fewer full
unbinding events were observed during production simulations
using the charged models. This effect also led to more reversible
transitions between the bound and collision complex states and
thus an overestimation of the true folding efficiency of IDPs upon
collision as estimated above. We also note that the collision
complexes as defined in our analysis were not intended to
represent so-called "encounter complexes" that have often been
considered key intermediates of protein-protein association [66],
although encounter complexes are also believed to be mainly
stabilized by nonspecific electrostatic interactions.
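
The bookkeeping behind such a three-state analysis can be sketched in a few lines of Python. The example below takes a per-frame sequence of state labels (U, CC, B), counts transitions between states, and averages the time spent in a state before each transition. The convention used here (timing from entry into a state until the first frame in the next state, and skipping the first incomplete visit) is only one plausible choice and an assumption of this sketch; the state assignment criteria and exact MFPT definitions used in this work are those given in Methods. The function name and the toy trajectory are hypothetical.

```python
from collections import defaultdict

def transition_statistics(states, dt):
    """Count transitions and the mean waiting time before each transition.

    states: sequence of per-frame labels, e.g. "U", "CC", "B".
    dt: time between consecutive frames.
    Returns {(source, target): (n_transitions, mean_time)}, with time measured
    from entering `source` until the first frame in `target`. The first visit
    is skipped because its entry time is unknown.
    """
    times = defaultdict(list)
    entry = None  # frame index at which the current state was entered
    for i in range(1, len(states)):
        if states[i] != states[i - 1]:
            if entry is not None:
                times[(states[i - 1], states[i])].append((i - entry) * dt)
            entry = i
    return {pair: (len(t), sum(t) / len(t)) for pair, t in times.items()}

# Toy trajectory: U -> CC -> U -> CC -> B
traj = ["U", "U", "CC", "CC", "U", "U", "U", "CC", "B", "B"]
stats = transition_statistics(traj, dt=0.5)

n_cc_to_b = stats[("CC", "B")][0]
n_cc_to_u = stats[("CC", "U")][0]
efficiency = n_cc_to_b / (n_cc_to_b + n_cc_to_u)
print(stats)
print(f"fraction of collision complexes reaching B: {efficiency:.2f}")
```

The same ratio of counts gives the percentages quoted in the text, for example 16 out of ~2300 (about 0.7%) and 108 out of 288 (about 37%) for NCBD/ACTR.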

Figure 3. Free energy as a function of the intermolecular contact fraction (Q_inter) at T_m. Profiles are shown for models without charges, with charges and
50 mM salt, and with charges. These profiles were calculated from the REX simulations using WHAM for: A) TAD1/TAZ2, B) HIF-1α/TAZ1, and C) NCBD/ACTR.
doi:10.1371/journal.pcbi.1003363.g003

Figure 4. Effective rate constants for transitions between the unbound, collision complex and bound states. The rates, as defined in
Methods Eqns. 1-4, were calculated using models with and without explicit charges and/or 50 mM salt for: A) TAD1/TAZ2, B) HIF-1α/TAZ1 and C)
NCBD/ACTR. The results demonstrate that long-range electrostatic forces increase both the capture and evolution rates and at the same time reduce
the escape rates.
doi:10.1371/journal.pcbi.1003363.g004

The enhanced apparent efficiency of folding upon encounter
appears to be frequently achieved at the cost of longer folding
times. For example, the MFPTs of transitions from the collision
complexes to the bound states increase from 0.26 to 3.94 ns for the
p53-TAD1/TAZ2 complex (Table S1) and from 8.14 to 44.56 ns
for the NCBD/ACTR complex (Table S3). The net effects on the
kinetics of encounter and folding stages can be quantified by
calculating three effective rate constants as defined in Eqns. 2-4
(see Methods) [28]. The results, summarized in Table 2 and
plotted in Fig. 4, clearly demonstrate that nonspecific electrostatic
interactions enhance the encounter rates and reduce the escape
rates of the collision complexes. Importantly, the effective
evolution rates are always faster, by about three-fold, in the
presence of long-range electrostatic forces, despite longer MFPTs
for the transitions from the collision complexes to the bound state
observed for the p53-TAD1/TAZ2 and NCBD/ACTR complexes.
The magnitude of electrostatic acceleration of folding upon
encounter is similar to what was previously observed for folding
and binding of p27 to the Cdk2/cyclin A complex [24].
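
As a consistency check on how the three effective rate constants combine, the sketch below treats recognition as the textbook scheme U ⇌ CC → B, in which a collision complex either escapes (k_esc) or evolves to the bound state (k_evo). Under that assumption the probability of evolving is k_evo/(k_esc + k_evo), and the steady-state association rate is k_cap times that probability. The rate values are those listed in Table 2; the formulas are the standard steady-state expressions and are assumed here, since the definitions actually used are Eqns. 1-4 in Methods.

```python
# Effective rate constants (per ns) from Table 2: (k_cap, k_esc, k_evo)
rates = {
    "p53-TAD1/TAZ2": {"no charge": (1.4, 8.0, 0.049),
                      "explicit charges": (32.1, 0.10, 0.16)},
    "HIF-1alpha/TAZ1": {"no charge": (2.8, 6.3, 0.022),
                        "explicit charges": (5.0, 0.69, 0.048)},
    "NCBD/ACTR": {"no charge": (0.13, 0.61, 0.0043),
                  "explicit charges": (0.79, 0.020, 0.012)},
}

def folding_efficiency(k_esc, k_evo):
    """Probability that a collision complex evolves to B rather than escaping."""
    return k_evo / (k_esc + k_evo)

def overall_rate(k_cap, k_esc, k_evo):
    """Steady-state association rate for U <-> CC -> B (assumed scheme)."""
    return k_cap * folding_efficiency(k_esc, k_evo)

for complex_name, models in rates.items():
    for model, (k_cap, k_esc, k_evo) in models.items():
        print(f"{complex_name:16s} {model:17s} "
              f"efficiency = {100 * folding_efficiency(k_esc, k_evo):5.1f}%  "
              f"k_cap * efficiency = {overall_rate(k_cap, k_esc, k_evo):.4f} per ns")
```

The efficiencies obtained this way are close to the collision-to-bound percentages quoted earlier in this section (for example, well under 1% without charges and roughly 60% with explicit charges for p53-TAD1/TAZ2), which suggests the simple scheme captures the main bookkeeping.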

Mechanism of electrostatically accelerated folding upon 
encounter 

Inspection of the conformational properties of the collision 
complexes provides further insights into the molecular basis for 
enhanced efficiency of IDP folding upon encounter due to long- 
range electrostatic forces. As shown in Fig. 5, without nonspecific 
electrostatic interactions (models without explicit charges), the 
initial contacts between two binding partners are largely random, 
and the distributions of IDP initial contact points on the substrate 
surface in the collision complexes are relatively uniform (left 
column). In contrast, with the inclusion of explicit charges, the 
probabilities of IDP encountering near the native binding interface 
are dramatically increased. Coupled with reduced escape rates, 
this allows much higher efficiency of IDP folding upon encounter 
to achieve higher overall association rate constants (Table 2). The 
ability of long-range electrostatic forces to guide the recognition 
process is also reflected in the free energy surfaces as a function of 
binding RMSD of the IDP and center of mass separation between 
two peptides. As shown in Fig. 6, long-range electrostatic forces 
generate a strong free energy gradient that extends over 10-15 Å
away from the native bound positions, without creating over- 
stabilized misfolded states at short separation distances. It is 
intriguing that, even though both NCBD and ACTR are 
disordered in the unbound state, nonspecific long-range electro- 
static forces between complementary charges on these two proteins 
can still manage to promote native-like topologies in the collision 
complexes. In particular, there is a much higher probability of 
NCBD and ACTR initiating contacts via the C-terminal helix of 
NCBD and the second helix of ACTR (Fig. 5E-F). This is part of a 
key pathway of synergistic folding inherent to the NCBD/ACTR 
complex that was predicted by coarse-grained and atomistic 
simulations [25,60] and later substantiated by H/D exchange 
mass spectrometry [64]. Therefore, nonspecific electrostatic 
interactions appear to mainly augment existing folding pathways 



inherent to the folded topologies to facilitate efficient folding of 
IDPs upon encounter. Coupled with the previous observation that 
the vicinity of the IDP binding site tends to be enriched with 
charges to complement those on IDPs [24], the current results 
suggest that there is likely a co-evolution of IDP folded topology, 
charge characteristics, and coupled binding and folding mecha- 
nisms. Furthermore, the co-evolution is likely driven by the 
important need to achieve facile IDP recognition for cellular 
signaling and regulation. 

Discussion 

While satisfying important functional requirements such as 
structural plasticity for binding numerous specific targets, protein 
intrinsic disorder can create kinetic bottlenecks that IDPs must 
overcome to be viable in cellular signaling and regulation. Our previous 
work on the p27/Cdk2/cyclin A complex revealed a mechanism where 
nonspecific electrostatic interactions not only enhance the protein- 
protein encounter kinetics but also promote folding-competent 
encounter topologies to increase the efficiency of IDP folding upon 
encounter [24]. Using carefully calibrated topology-based coarse- 
grained models, we have now further demonstrated that similar 
electrostatically accelerated encounter and folding mechanisms 
also underlie the formation of three IDP complexes with more 
complex folded structures, namely, p53-TAD1/TAZ2, HIF-1α/TAZ1, 
and NCBD/ACTR. Importantly, these complexes lack 
apparent features on the electrostatic surface potentials to directly 
suggest the ability of nonspecific long-range electrostatic forces to 
promote native-like encounter topologies to enhance the IDP 
folding efficiency upon encounter. Nonetheless, there seems to 
exist a sufficient level of self-consistency between the charge 
distributions and folded topologies in the bound state to allow 
accelerated recognition in the presence of nonspecific electrostatic 
interactions. Therefore, enriched charges on IDPs not only play 
key roles in modulating the conformational properties of the 
unbound state, but also likely play general and important roles in 
regulating efficient interactions of IDPs with specific partners. We 
note that IDPs are frequently regulated by post-translational 
modifications that add or remove charges. Improved mechanistic 
understanding of electrostatic forces in IDP recognition derived 
from the current work will thus help to dissect the profound 
impacts of post-translational modifications and disease-related 
mutations on IDP structure and interaction. 



Figure 5. Distributions of IDPs on the substrate surfaces in the 
collision complexes derived from simulations using models 
with and without explicit charges. For the p53-TAD1/TAZ2 (A-B) 
and HIF-1α/TAZ1 (C-D) complexes, TAZ2 and TAZ1 are colored based 
on the probability of each residue being in contact with the IDPs in the 
collision complex ensembles, and p53-TAD1 and HIF-1α are shown only 
in the folded and bound conformations (yellow cartoon) for reference. For 
the NCBD/ACTR complex (E-F), both IDPs are shown in the bound and 
folded conformations and colored based on the probability of each 
residue being involved in (nonspecific) intermolecular contacts in the 
collision complex ensemble. 
doi:10.1371/journal.pcbi.1003363.g005 

Methods 

Calibration of topology-based coarse-grained models 
with and without explicit charges 

Cα-only sequence-flavored Go-like models [67] were first 
derived from the complex structures of p53-TAD1/TAZ2, 
HIF-1α/TAZ1 and NCBD/ACTR (see Table 1) using the 
Multiscale Modeling Tools for Structural Biology (MMTSB) 
Go-Model Builder (http://www.mmtsb.org) [68]. The three zinc ions 
bound to TAZ1 in the HIF-1α/TAZ1 complex were modeled 
explicitly with distance restraints to the coordinating residues. All 
three models were then calibrated to balance the intrinsic folding 
propensity and the strength of intermolecular interactions using a 
previously described protocol [61]. Briefly, the strengths of 
intramolecular native contacts were uniformly scaled to reproduce the 



experimentally measured residual helicity of unbound IDPs, which 
is mainly based on NMR secondary chemical shift and/or 
circular dichroism analysis (p53-TAD1 [69], NCBD/ACTR [63], 
and HIF-1α [70]). The residual helicity distributions calculated 
using the final models listed in Table 2 are provided in Fig. S1. 
Then, the strengths of intermolecular contacts were adjusted, such 
that binding affinities calculated from replica exchange molecular 
dynamics (REX-MD) simulations approximately match the 
experimental values (see Table 1). Following the previously 
described procedure [24], the calibrated sequence-flavored Go- 
like models were then further modified by assigning proper explicit 
charges to all charged residues (Lys, Arg, Glu and Asp) as well as 
zinc ions in the HIF-1α/TAZ1 complex. The charged models 
were then re-calibrated to reproduce the experimental residual 
structure level (Fig. S1) and binding affinity (Table 2). Such 
calibration is critical to avoid inherent bias for particular types of 
interactions, e.g., intra- vs. inter-molecular or native vs. 
nonspecific electrostatic. Nonspecific electrostatic interactions were 
modeled using the Debye-Hückel potential to account for ionic 
screening. The dielectric constant was set at 80. 
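
To make the screened electrostatic term concrete, the following is a minimal sketch of a Debye-Hückel pair energy with a dielectric constant of 80. It is not taken from the simulation code used in this work: the function names, the assumption of a 1:1 electrolyte, the 300 K temperature, and the Coulomb constant of 332.06 kcal·Å/(mol·e²) are illustrative assumptions, and the zero-salt branch simply falls back to an unscreened Coulomb term.

```python
import math

# Physical constants (SI units)
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K
N_A = 6.02214076e23          # Avogadro constant, 1/mol
# Coulomb constant in common MD units (assumed here for illustration)
COULOMB_KCAL = 332.06371     # kcal*Angstrom/(mol*e^2)


def debye_length_angstrom(ionic_strength_molar, temperature=300.0, dielectric=80.0):
    """Debye screening length in Angstrom for a 1:1 electrolyte."""
    if ionic_strength_molar <= 0.0:
        return math.inf  # no salt: no screening
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    kappa_sq = (2.0 * N_A * E_CHARGE ** 2 * ionic_strength_si
                / (dielectric * EPS0 * K_B * temperature))
    return 1.0e10 / math.sqrt(kappa_sq)  # m -> Angstrom


def debye_huckel_energy(q_i, q_j, r_angstrom, ionic_strength_molar,
                        temperature=300.0, dielectric=80.0):
    """Screened Coulomb energy (kcal/mol) between two point charges (units of e)."""
    lam = debye_length_angstrom(ionic_strength_molar, temperature, dielectric)
    coulomb = COULOMB_KCAL * q_i * q_j / (dielectric * r_angstrom)
    return coulomb * math.exp(-r_angstrom / lam)


if __name__ == "__main__":
    print("Debye length at 50 mM salt (Angstrom):", debye_length_angstrom(0.05))
    print("Lys-Glu pair at 10 Angstrom, 50 mM salt (kcal/mol):",
          debye_huckel_energy(1.0, -1.0, 10.0, 0.05))
```

Under these assumptions the screening length at 50 mM salt is on the order of 14 Å, comparable to the 10-15 Å range over which the free energy gradient is observed in Fig. 6.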

Simulation protocols 

The complexes were simulated in cubic boxes with periodic 
boundary conditions imposed in CHARMM [71,72]. The box 
sizes are 100, 100 and 105 Å for p53-TAD1/TAZ2, HIF-1α/TAZ1 
and NCBD/ACTR, respectively. Langevin dynamics was 
performed with 15 fs time steps and a friction coefficient of 
0.1 ps⁻¹. SHAKE was used to fix all virtual bond lengths [73]. 
Non-bonded interactions were cut off at 25 Å. Unbound IDPs 
were simulated at 300 K for 750 ns to calibrate the intramolecular 
interactions. REX-MD was performed using the MMTSB Toolset 
[68] for calibration of the intermolecular interactions. For this, 
eight replicas spanning 270 to 400 K were used. The lengths of 
the REX calibration simulations ranged from 1.05 μs (for p53-TAD1/TAZ2) 
up to 10 μs (for NCBD/ACTR), as needed for achieving 
sufficient convergence. The temperature weighted histogram analysis 
method (WHAM) [74] was used to compute the heat capacity (C_v) 
curves and generate unbiased probability distributions for free 
energy and thermodynamic analysis. In particular, the dissociation 
constants (K_D) were calculated from the bound and unbound 
probabilities at 300 K [61], where the unbound state was defined 
as the state without any native intermolecular contacts formed. For 
the NCBD/ACTR complex, the 1D free energy profile lacks 
significant barriers between the unbound and partially bound 
intermediate states (Fig. 3C, red trace). Therefore, the unbound 
probability was calculated as 1 - P_bound, where P_bound is the bound 
probability (see below for the specific criteria of state assignments). 
Once calibrated, production simulations of 30-40 μs in length 
were performed using all models at the corresponding T_m values (see 
Table 2). The T_m value was first identified based on the C_v curve 
and then fine tuned to ensure that similar probabilities of sampling 
the bound and unbound states were observed in the production 
simulation. 
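
As an illustration of how a dissociation constant can be obtained from bound and unbound probabilities in a periodic box, the sketch below uses the standard relation for one copy of each partner in a volume V, K_D = c0 · P_unbound² / P_bound with c0 = 1/(N_A · V). The exact expression used in ref. [61] is not reproduced in this section, so this formula, the function names, and the default P_unbound = 1 - P_bound are assumptions made for illustration only.

```python
N_A = 6.02214076e23          # Avogadro constant, 1/mol
ANGSTROM3_TO_LITER = 1.0e-27  # 1 cubic Angstrom in liters


def box_concentration_molar(box_length_angstrom):
    """Molar concentration of a single molecule in a cubic box."""
    volume_liter = box_length_angstrom ** 3 * ANGSTROM3_TO_LITER
    return 1.0 / (N_A * volume_liter)


def dissociation_constant(p_bound, box_length_angstrom, p_unbound=None):
    """K_D (mol/L) for one IDP and one substrate in a cubic periodic box.

    If p_unbound is not given, it is taken as 1 - p_bound, mirroring the
    treatment described for the NCBD/ACTR complex.
    """
    if p_unbound is None:
        p_unbound = 1.0 - p_bound
    if not (0.0 < p_bound <= 1.0) or not (0.0 <= p_unbound <= 1.0):
        raise ValueError("probabilities must lie in [0, 1] with p_bound > 0")
    c0 = box_concentration_molar(box_length_angstrom)
    # [A] = [B] = p_unbound * c0 and [AB] = p_bound * c0
    return c0 * p_unbound ** 2 / p_bound


if __name__ == "__main__":
    # Hypothetical probabilities, not values from this study
    print("c0 for a 100 Angstrom box (M):", box_concentration_molar(100.0))
    print("K_D for P_bound = 0.9 (M):", dissociation_constant(0.9, 100.0))
```

With a 100 Å box the effective concentration of each partner is roughly 1.7 mM, so the probabilities at 300 K must be strongly skewed toward the bound state to match micromolar or nanomolar affinities.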

Free energy and kinetic analysis 

All free energy profiles were calculated from the REX 
simulations and the kinetic analysis was performed based on the 
production simulations, unless otherwise stated. For calculation of 
contact fractions, a given native contact was considered as formed 
if the inter-Cα distance was within 1.0 Å of the distance in the 
native complex. Nonspecific intermolecular contacts were considered 
as formed when the inter-Cα distance was within a 10 Å cutoff. 
Three general conformational states were defined for each 
complex, including the unbound (U), collision complex (CC) and 
[Figure 6 panel titles: No charge; Explicit charges, 50 mM salt; Explicit charge]




Figure 6. Free-energy surfaces at T_m as a function of binding RMSD of the IDP and center of mass separation between two peptides 
(ΔCM), computed using various Go-like models with and without explicit charges and/or 50 mM salt (see Table 2). The binding RMSD 
(of the IDP) was calculated by first aligning the snapshot with respect to the folded structure using only the folded substrate. For NCBD/ACTR, both 
proteins are IDPs and the (regular) RMSD was calculated using the whole complex. Rows A-C, D-F and G-I are for the p53-TAD1/TAZ2, HIF-1α/TAZ1 
and NCBD/ACTR complexes, respectively. Contours are drawn every kT. 
doi:10.1371/journal.pcbi.1003363.g006 



bound (B) states, to understand the effects of electrostatic forces on 
protein-protein encounter and subsequent folding upon encounter. 
The unbound state includes conformations with no specific or 
nonspecific contacts formed between IDP and substrate, and the 
collision complex state includes conformations with at least one 
nonspecific but no specific intermolecular contact formed. The 
bound states are defined as follows: 1) for p53-TAD1/TAZ2: 
N_inter ≥ 11; 2) for HIF-1α/TAZ1: N_inter ≥ 26 for the no charge 
model, N_inter ≥ 23 for the charged model, and N_inter ≥ 24 for the 
charged model with 0.05 M salt; 3) for ACTR/NCBD: N_inter ≥ 30. 
N_inter is the total number of native intermolecular contacts formed. 
Note that slightly different criteria were used to define the bound 
state of HIF-1α/TAZ1 due to small shifts of the bound free energy 
basins calculated using different models (see Fig. 3). 15-ps running 
averages were used for assigning states, to avoid including fictitious 
transitions due to rapid small fluctuations in the calculated contact 
counts (especially between the U and CC states). The overall on 
and off rates were calculated directly from the average lifetimes of 
the bound and unbound states (see Table S4). In addition, MFPTs 
and numbers of transitions among all three states were derived 
from the production simulation trajectories, and various rates were 
calculated as defined in Eqns. 2-4. 



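$$\mathrm{U} \underset{k_{esc}}{\overset{k_{cap}}{\rightleftharpoons}} \mathrm{CC} \xrightarrow{k_{evo}} \mathrm{B} \tag{1}$$

$$k_{cap} = \mathrm{MFPT}_{cap}^{-1} \tag{2}$$

$$k_{esc} = \left[\frac{\mathrm{MFPT}_{esc} \times N_{esc} + \mathrm{MFPT}_{evo} \times N_{evo}}{N_{esc} + N_{evo}}\right]^{-1} \times \frac{N_{esc}}{N_{esc} + N_{evo}} \tag{3}$$

$$k_{evo} = \left[\frac{\mathrm{MFPT}_{esc} \times N_{esc} + \mathrm{MFPT}_{evo} \times N_{evo}}{N_{esc} + N_{evo}}\right]^{-1} \times \frac{N_{evo}}{N_{esc} + N_{evo}} \tag{4}$$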

Here, k_cap, k_esc, and k_evo are the capture, escape (to the unbound 
state) and evolution (to the bound state) rates of the collision 
complex, respectively; N_esc and N_evo are the numbers of escape and 
evolution transitions. Note that the MFPTs calculated correspond 
to the average times spent in an initial state before a transition to 
the final state. Ideally, the average lifetime of CC should be 
independent of whether the trajectory ends up in either the U or B 
state for a true three-state model as shown in Eq. 1. However, the 
actual transitions between the CC and B states involve several 
intermediates that are not represented in Eq. 1, and the effective 
MFPTs as calculated thus depend on both the initial and final 
states (e.g., see Tables S1, S2, S3). Analytical expressions for 
similar MFPTs involved in amyloid fibril templating can be found 
in a recent theoretical analysis by Schmit [75]. All molecular 
visualizations were prepared using VMD [76]. 
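
The rate definitions in Eqns. 2-4 can be sketched as follows. This is a simplified illustration, not the analysis script used in this work: it assumes every frame has already been assigned to one of the three labels "U", "CC" or "B" (so any intermediates between CC and B are lumped into a neighboring state beforehand), and the function names and the toy trajectory are hypothetical.

```python
from itertools import groupby


def dwell_events(states, dt):
    """Collapse a per-frame state sequence into (state, next_state, dwell_time).

    The final run is dropped because its exit transition is not observed.
    """
    runs = [(s, sum(1 for _ in g)) for s, g in groupby(states)]
    return [(s, runs[i + 1][0], n * dt) for i, (s, n) in enumerate(runs[:-1])]


def mean(values):
    return sum(values) / len(values)


def three_state_rates(states, dt):
    """Capture, escape and evolution rates following Eqns. 2-4."""
    events = dwell_events(states, dt)
    t_cap = [t for s, nxt, t in events if s == "U" and nxt == "CC"]
    t_esc = [t for s, nxt, t in events if s == "CC" and nxt == "U"]
    t_evo = [t for s, nxt, t in events if s == "CC" and nxt == "B"]
    if not (t_cap and t_esc and t_evo):
        raise ValueError("need at least one capture, escape and evolution event")

    n_esc, n_evo = len(t_esc), len(t_evo)
    # Average CC lifetime over all exits, weighted by the number of each exit type
    cc_lifetime = (mean(t_esc) * n_esc + mean(t_evo) * n_evo) / (n_esc + n_evo)

    return {
        "k_cap": 1.0 / mean(t_cap),                              # Eq. 2
        "k_esc": (1.0 / cc_lifetime) * n_esc / (n_esc + n_evo),  # Eq. 3
        "k_evo": (1.0 / cc_lifetime) * n_evo / (n_esc + n_evo),  # Eq. 4
    }


if __name__ == "__main__":
    # Toy trajectory: one escape and one evolution event
    toy = ["U"] * 3 + ["CC"] * 2 + ["U"] + ["CC"] * 4 + ["B"] * 2
    print(three_state_rates(toy, dt=1.0))
```

By construction k_esc and k_evo sum to the inverse of the average collision-complex lifetime, so the two rates partition the total exit rate according to the observed branching between escape and evolution.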

Supporting Information 

Figure S1 Residual helicities of (a) p53-TAD1, (b) HIF-1α, and (c) 
ACTR in the unbound states calculated using different Go-like 
models. The solid traces correspond to models without explicit 
charges and the dashed traces are from the charged models. The 
black traces were computed from models with no adjustment of the 
intramolecular interaction strengths (i.e., scale = 1.0), which 
significantly over-stabilized the helices. The red traces were calculated 
using the final calibrated models with optimal scaling of intramo- 
lecular interactions (see Table 2 of the main text). The residual 
helicity showed minimal dependence on the salt concentration for all 
peptides and the corresponding profiles are thus not shown. 
(TIF) 

Figure S2 2D free energy surfaces at T_m calculated using models 
with (panels A, C, and E) and without explicit charges (panels B, D, and F) 
(see Table 2 of the main text). Q_inter^H1 and Q_inter^H3 are the 
fractions of native intermolecular contacts formed by the first and 
third helices of HIF-1α, respectively. R_cm is the distance between the 
centers of mass of HIF-1α and TAZ1. Contours are drawn every kT. 
(TIF) 

Figure S3 2D free energy surfaces at T_m calculated using models 
with (panels A, C, and E) and without explicit charges (panels B, D, 
and F) (see Table 2 of the main text). Q_inter^ACTR-H1, Q_inter^ACTR-H2 and 
Q_inter^ACTR-H3 are the fractions of native intermolecular contacts 
formed by the first, second and third helices of ACTR, 
respectively. Contours are drawn every kT. 
(TIF) 

Figure S4 2D free energy surfaces at T_m calculated using models 
with (panels A, C, and E) and without explicit charges (panels B,D, 
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